Autonomous Personal Filtering Improves Global Spam Filter Performance
Abstract
Using two email streams, we show that a personal filter trained exclusively on user feedback substantially outperforms (p ≈ 0.000) three industry-leading global spam filters that do not use feedback. We also show that autonomous personal filters, trained on the output of a global spam filter rather than on user feedback, substantially outperform (p ≈ 0.000) the global filter, albeit by a somewhat smaller factor than user-feedback-trained personal filters. To our knowledge, no controlled quantitative study addressing these questions has previously been reported.
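The autonomous setup described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not say which learning method the personal filter uses, so a small online naive Bayes classifier is swapped in, and `PersonalFilter`, `global_filter`, and `run_autonomous` are hypothetical names. The keyword rule inside `global_filter` is only a stand-in for a real global filter.

```python
import math
from collections import Counter


class PersonalFilter:
    """Tiny online multinomial naive Bayes over lowercase word tokens (assumed model)."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.msg_counts = {"spam": 0, "ham": 0}
        self.vocab = set()

    def train(self, text, label):
        """Update counts with one message and its label ('spam' or 'ham')."""
        tokens = text.lower().split()
        self.word_counts[label].update(tokens)
        self.msg_counts[label] += 1
        self.vocab.update(tokens)

    def spam_log_odds(self, text):
        """Log-odds of spam versus ham, using add-one smoothing."""
        score = math.log((self.msg_counts["spam"] + 1) / (self.msg_counts["ham"] + 1))
        v = len(self.vocab) + 1
        totals = {c: sum(self.word_counts[c].values()) for c in ("spam", "ham")}
        for tok in text.lower().split():
            p_spam = (self.word_counts["spam"][tok] + 1) / (totals["spam"] + v)
            p_ham = (self.word_counts["ham"][tok] + 1) / (totals["ham"] + v)
            score += math.log(p_spam / p_ham)
        return score

    def is_spam(self, text):
        return self.spam_log_odds(text) > 0


def global_filter(text):
    """Hypothetical stand-in for a global filter: a fixed keyword rule."""
    return "spam" if "viagra" in text.lower() else "ham"


def run_autonomous(stream, personal, global_verdict):
    """Classify each message first, then train on the global verdict.

    No user feedback is involved: the global filter's output is the only label.
    """
    decisions = []
    for text in stream:
        decisions.append(personal.is_spam(text))
        personal.train(text, global_verdict(text))
    return decisions
```

In this toy setting the personal filter can pick up words that co-occur with the global filter's spam verdicts, so it may flag a later message that the keyword rule alone would pass. Whether that translates into the gains reported above depends on the real filters and email streams.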
Similar resources
SPONGY (SPam ONtoloGY): Email Classification Using Two-Level Dynamic Ontology
Email is one of the common communication methods between people on the Internet. However, growing misuse and abuse of email has led to an increasing volume of spam over recent years. An experimental system has been designed and implemented with the hypothesis that this method would outperform existing techniques, and the experimental results showed that indeed the proposed ontology-bas...
Introduction of Fingerprint Vector based Bayesian Method for Spam Filtering
As spam diversifies, content-based spam filtering becomes more difficult and challenging. To address this problem, this paper first introduces the statistical features of email headers and then proposes a method that uses these features to improve a Bayesian anti-spam filter. The selected email-header features are presented as fingerprint vectors, and the...
IBM SpamGuru on the TREC 2005 Spam Track
IBM Research is developing an enterprise-class anti-spam filter as part of our overall strategy of attacking the spam problem on multiple fronts. Our anti-spam filter, SpamGuru, mirrors this philosophy by incorporating several different filtering technologies and intelligently combining their output to produce a single spamminess rating. The use of multiple algorithms improves the system’s effec...
A Survey on Machine Learning Methods in Spam Filtering
Email spam, or junk e-mail (unwanted e-mail “usually of a commercial nature sent out in bulk”), is one of the major issues of today's Internet, causing financial damage to companies and annoying individual users. Among the approaches developed to stop spam, filtering is an important and popular one. Common uses for mail filters include organizing incoming email and removal of spam and compu...
An Evaluation of Filtering Techniques in a Naïve Bayesian Anti-Spam Filter
An efficient anti-spam filter that blocks all unsolicited messages, i.e. spam, without blocking any legitimate messages is a growing need. To address this problem, this report takes a statistically based approach, employing a Bayesian anti-spam filter, because it is content-based and self-learning (adaptive) in nature. We train the filter, using a large corpus of legitimate messages and spa...